1. Introduction.

Let $y_1 < y_2 < \ldots < y_n$ be an ordered sample from the
situation $F\left(\frac{y-\mu}{\sigma}\right)$. In this report we are concerned about inference
with regard to the scale parameter $\sigma$. From the beginning we will
restrict attention to location-invariant and scale-equivariant
statistics $S$, i.e.

\[ S(s(t\vec{1} + \vec{c})) = s\,S(\vec{c}) . \]


remark: On the two-dimensional class of samples
$\vec{y}(s,t) = s(t\vec{1} + \vec{c})$ the statistic $S$ is known if only $S(\vec{c})$ is
fixed. We call the representing element $\vec{c}$ a configuration (see
Morgenthaler (1983a)).

Prepared in part in connection with research at Princeton University,
supported by the Army Research Office (Durham). The computing facilities
were provided by the Department of Energy, Contract DE-AC02-81ER10341.


To simplify the situation, we will transform our parameter space by

\[ \tau = \log(\sigma) . \]

It is well known that this transformation symmetrizes the
distributions involved (see e.g. Bartlett and Kendall (1946)).
Furthermore it is of mathematical convenience.

For a $\tau$-estimator $T(\,)$ we now require

\[ T(s(t\vec{1} + \vec{c})) = T(\vec{c}) + \log(s) . \]
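The requirement on $T$ can be checked numerically. The following is a minimal sketch, assuming the configuration is represented by the sample standardized to mean 0 and standard deviation 1 and taking the log of the sample standard deviation as $T$; both choices, and the function names, are illustrative assumptions and not taken from the report.

```python
import math
import statistics

def decompose(y):
    """Write y_i = s * (c_i + t), with c standardized to mean 0 and
    standard deviation 1 (one hypothetical choice of representative)."""
    m = statistics.mean(y)
    s = statistics.stdev(y)
    c = [(v - m) / s for v in y]
    t = m / s
    return s, t, c

def tau_estimate(y):
    """Log of the sample standard deviation, one equivariant tau-estimator."""
    return math.log(statistics.stdev(y))

y = [1.2, 1.9, 2.4, 3.1, 4.8]
a, b = 3.0, -7.0                      # arbitrary positive rescaling and shift
z = [a * v + b for v in y]

s_y, t_y, c_y = decompose(y)
s_z, t_z, c_z = decompose(z)
# Same configuration, and the estimate moves by exactly log(a).
print(max(abs(p - q) for p, q in zip(c_y, c_z)))
print(tau_estimate(z) - tau_estimate(y), math.log(a))
```

Shifting and rescaling the sample leaves the configuration unchanged, while the estimate on the $\tau$-scale moves by $\log(a)$, which is the equivariance property required above.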


This is the starting point for our discussion. In the next chapter
we will derive the conditional confidence distribution and examine
the resulting strong confidence intervals. The third chapter will
be devoted to the study of poly- and bi-optimal confidence interval
procedures.

2. Compromising between the Gaussian and the slash: Strong
confidence intervals.


2.1.  Introduction 


Conditioned on any given configuration $\vec{c}$, the distribution of
$T(\,)$ is determined by the distribution of $\log(s)$ under the situation
$F$ we sample from. The choice $T(\vec{c})$ acts like a location parameter to
an otherwise fixed distribution. This implies that the conditional
variance is not at all influenced by our choice of $T(\vec{c})$: whatever
we choose, inside the configuration the variability will be fixed.
For setting confidence limits we are interested in the distribution
of $\log(s)$ conditioned on the configuration as well.

Let $\mathrm{ds}_F(x \mid \mu,\sigma,\vec{c})$ denote the conditional density of $\log(s)$ given the
configuration $\vec{c}$ under sampling from $F\left(\frac{y-\mu}{\sigma}\right)$. Then we have

\[ \mathrm{ds}_F(x \mid \mu,\sigma,\vec{c}) = \int_{-\infty}^{\infty} e^{x}\, k(e^{x}, t \mid \mu,\sigma,\vec{c})\, dt \qquad (2.1) \]

where $k(s,t \mid \mu,\sigma,\vec{c})$ is the conditional density expressed in terms
of the configuration parameters $s$ and $t$ given we are in
configuration $\vec{c}$ and the underlying parameter values are $\mu$ and
$\sigma$.

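proof:

\[ \mathrm{ds}_F(x \mid \mu,\sigma,\vec{c}) = \frac{\partial}{\partial x} P[\log(s) < x \mid \mu,\sigma,\vec{c}]
   = \frac{\partial}{\partial x} \int_{-\infty}^{\infty} \int_{0}^{e^{x}} k(s,t \mid \mu,\sigma,\vec{c})\, ds\, dt . \]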

It then follows that

\[ \mathrm{ds}_F(x \mid \mu,\sigma,\vec{c}) = \mathrm{ds}_F(x-\tau \mid 0,1,\vec{c}) \qquad (2.2) \]

where $\tau = \log(\sigma)$. This is a consequence of a simple change of
variables (see Morgenthaler (1983a)):

\[ \mathrm{ds}_F(x \mid \mu,\sigma,\vec{c}) = \int_{-\infty}^{\infty} e^{x}\, k(e^{x}, t \mid \mu,\sigma,\vec{c})\, dt
   = \int_{-\infty}^{\infty} e^{x-\tau}\, k(e^{x-\tau}, q \mid 0,1,\vec{c})\, dq
   = \mathrm{ds}_F(x-\tau \mid 0,1,\vec{c}) . \]

Now we know what effects changes in the parameter values $\mu$ and $\sigma$
have. The location parameter $\mu$ has no effect at all, whereas the
scale parameter fixes the location of $\mathrm{ds}_F(\,)$, which is otherwise
unchanged.

This shows us that the $\tau$-estimation problem is a location-type
problem with known scale.
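The change of variables behind (2.2) can be spelled out as follows. This is a sketch of the intermediate step, not quoted from the report; the symbols $s'$ and $q$ are introduced here for illustration, and $\vec{1}$ denotes the vector of ones.

```latex
% y = s(c + t 1) is sampled from F((y - mu)/sigma); standardize the sample.
\begin{align*}
\frac{\vec{y}-\mu\vec{1}}{\sigma}
  &= \frac{s}{\sigma}\Bigl(\vec{c} + \bigl(t-\tfrac{\mu}{s}\bigr)\vec{1}\Bigr)
   = s'\,(\vec{c} + q\,\vec{1}),
  \qquad s' = \frac{s}{\sigma},\quad q = t-\frac{\mu}{s},\\
\left|\frac{\partial(s',q)}{\partial(s,t)}\right| &= \frac{1}{\sigma}
  \;\Longrightarrow\;
  k(s,t\mid\mu,\sigma,\vec{c}) = \frac{1}{\sigma}\,
  k\!\left(\frac{s}{\sigma},\,t-\frac{\mu}{s}\,\Big|\,0,1,\vec{c}\right),\\
e^{x}\,k(e^{x},t\mid\mu,\sigma,\vec{c})
  &= e^{x-\tau}\,k\bigl(e^{x-\tau},\,q\mid 0,1,\vec{c}\bigr),
  \qquad \tau=\log\sigma,\quad dq = dt \ \text{for fixed } s = e^{x}.
\end{align*}
```

Integrating the last line over $t$ (equivalently over $q$) gives the shifted density, and $\mu$ drops out because it only enters through the integration variable.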

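remark: The effects of changing the class-representing configuration
$\vec{c}$ are as follows:

\[ \mathrm{ds}_F(x \mid 0,1,\vec{c} + w\vec{1}) = \mathrm{ds}_F(x \mid 0,1,\vec{c}) \]

\[ \mathrm{ds}_F(x \mid 0,1,v\vec{c}) = \mathrm{ds}_F(x+\log(v) \mid 0,1,\vec{c}) \]

where $w \in R$ and $v \in R^{+}$.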

2.2. Single situation case: known shape F

If we choose

\[ T(\vec{c}) = -\mathrm{ave}_F[\log(s) \mid \mu^*,\sigma^*,\vec{c}] = -\tau^* - \mathrm{ave}_F[\log(s) \mid 0,1,\vec{c}] \]

for arbitrary values of $\mu^*$ and $\sigma^*$ we will have

\[ \mathrm{ave}_F[T \mid \mu,\sigma,\vec{c}] = \mathrm{ave}_F[T(\vec{c}) + \log(s) \mid \mu,\sigma,\vec{c}]
   = T(\vec{c}) + \mathrm{ave}_F[\log(s) \mid 0,1,\vec{c}] + \tau = \tau - \tau^* \]

where $\tau = \log(\sigma)$ and $\tau^* = \log(\sigma^*)$. Any of these choices of $T(\vec{c})$
leads therefore to estimators whose overall mean is equal to all the
conditional means, i.e. it is not functionally dependent on $\vec{c}$. Its
variance is therefore the average of the conditional variances which,
as we have noticed above, are fixed and cannot be influenced
by choosing another value for $T(\vec{c})$. This estimate for any choice of
$\sigma^*$ has therefore the minimal possible variance.

There is an infinite class of $\tau$-estimators with smallest
variance. The difference of two such estimators is constant.

On the $\sigma$-scale they are multiples of each other, but there the
behavior is more complex.

The problem is in one way simpler than the location point-estimation
problem, but there is an additional difficulty. We are completely
free in choosing the standard form $F(\,)$ which is used as a reference
to describe the scaling. In this sense the scale parameter $\sigma$ is a
relative parameter, describing the scale relative to a standard
form. In the case of the location parameter $\mu$ we were able to escape
this difficulty by restricting attention to symmetric shapes and
choosing the standard form $F(\,)$ such that the center of symmetry is
at 0.

For the Gaussian situation we could adopt such an escape for
the scale parameter too and fix the standard form such that the
variance is equal to 1. In this case we have defined a target,
the standard deviation, for our estimator of $\sigma$ and it makes sense
to ask for the estimator, now on the $\tau$-scale, which is
unbiased and has smallest variance. In order to be unbiased we need

\[ \mathrm{ave}_F[T \mid 0,1,\vec{c}] = \log(1) = 0 \]

and hence

\[ T(\vec{c}) = -\mathrm{ave}_F[\log(s) \mid 0,1,\vec{c}] . \]

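Setting confidence limits is straightforward if we have a target in
mind. If $U(\,)$ is a scale-equivariant upper bound for $\tau$, i.e.

\[ U(s(\vec{c} + t\vec{1})) = \log(s) + U(\vec{c}) , \]

we are concerned about

\[ P[U > \log(\sigma) \mid \mu,\sigma,\vec{c}] = P[\log(s) + U(\vec{c}) > \tau \mid \mu,\sigma,\vec{c}]
   = P[\log(s) > \tau - U(\vec{c}) \mid \mu,\sigma,\vec{c}] \]

\[ = \int_{\tau-U(\vec{c})}^{\infty} \mathrm{ds}_F(x \mid \mu,\sigma,\vec{c})\, dx
   = \int_{\tau-U(\vec{c})}^{\infty} \mathrm{ds}_F(x-\tau \mid 0,1,\vec{c})\, dx \]

\[ = \int_{-U(\vec{c})}^{\infty} \mathrm{ds}_F(x \mid 0,1,\vec{c})\, dx
   = 1 - \int_{-\infty}^{-U(\vec{c})} \mathrm{ds}_F(x \mid 0,1,\vec{c})\, dx . \]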


There are two natural choices, the balanced and the conditionally
shortest choice of upper and lower bound. The length of confidence
intervals conditioned on configurations is fixed, since

\[ U(\vec{y}) - L(\vec{y}) = \log(s) + U(\vec{c}) - \log(s) - L(\vec{c}) = U(\vec{c}) - L(\vec{c}) \]

if $\vec{y} = s(\vec{c} + t\vec{1})$.

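For the balanced confidence interval with conditional confidence
level $100(1-\alpha)\%$ we take

\[ U(\vec{c}) = -\mathrm{ds}_F(\tfrac{\alpha}{2} \mid 0,1,\vec{c}), \qquad
   L(\vec{c}) = -\mathrm{ds}_F(1-\tfrac{\alpha}{2} \mid 0,1,\vec{c}) \qquad (2.3) \]

where the quantile $\mathrm{ds}_F(p \mid 0,1,\vec{c})$ is defined by

\[ \int_{-\infty}^{\mathrm{ds}_F(p \mid 0,1,\vec{c})} \mathrm{ds}_F(x \mid 0,1,\vec{c})\, dx = p . \]

September 21, 1983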
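The balanced bounds can be computed from any conditional density by numerical integration. The following is a minimal sketch, assuming a density that is negligible outside a finite range; the standard normal density used in the demonstration is only a stand-in for $\mathrm{ds}_F(x \mid 0,1,\vec{c})$, and the function names are hypothetical.

```python
import math

def quantile(density, p, lo=-20.0, hi=20.0, steps=40000):
    """p-quantile of a density on [lo, hi] via the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    prev = density(lo)
    for i in range(1, steps + 1):
        x = lo + i * h
        cur = density(x)
        piece = 0.5 * h * (prev + cur)
        if total + piece >= p:
            # linear interpolation inside the current cell
            return x - h + h * (p - total) / piece
        total += piece
        prev = cur
    return hi

def balanced_interval(density, alpha):
    """Bounds U = -q(alpha/2) and L = -q(1 - alpha/2), as in the balanced choice."""
    upper = -quantile(density, alpha / 2)
    lower = -quantile(density, 1 - alpha / 2)
    return lower, upper

def std_normal(x):
    # Stand-in density for the demonstration only.
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

L, U = balanced_interval(std_normal, 0.05)
print(L, U)
```

The sign flip reflects that the bounds for $\tau$ are obtained from the quantiles of $\log(s)$ under the standard parameter values.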

Again there is the problem of specifying a target. We have seen that
there is an infinite class of $\tau$-estimators with smallest
variance. Similarly we can create an infinite class of $100(1-\alpha)\%$
symmetric confidence intervals by moving the one defined in (2.3) by
an arbitrary constant. Of course it will then be a $100(1-\alpha)\%$
confidence interval for a different target.

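remark: The Gaussian case ($F = \Phi$)

Using (2.1) with the standard Gaussian $\Phi(\,)$ we get

\[ \mathrm{ds}_\Phi(x \mid 0,1,\vec{c}) \propto \int_{-\infty}^{\infty} e^{x} (e^{x})^{n-1}
   \exp\Bigl(-\frac{e^{2x}}{2} \sum_{i=1}^{n} (t+c_i)^2\Bigr)\, dt , \]

i.e.

\[ \mathrm{ds}_\Phi(x \mid 0,1,\vec{c}) \propto e^{nx} \exp\Bigl(-\frac{e^{2x}}{2} \sum_{i=1}^{n} (c_i-\bar{c})^2\Bigr)
   \int_{-\infty}^{\infty} \exp\Bigl(-\frac{n e^{2x}}{2} (t+\bar{c})^2\Bigr)\, dt . \]

The integral in the last line is proportional to $e^{-x}$ and hence

\[ \mathrm{ds}_\Phi(x \mid 0,1,\vec{c}) \propto e^{(n-1)x} \exp\Bigl(-\frac{e^{2x}}{2} \sum_{i=1}^{n} (c_i-\bar{c})^2\Bigr)
   \propto (e^{2x})^{\frac{n-1}{2}-1} \exp\Bigl(-\frac{e^{2x}}{2} \sum_{i=1}^{n} (c_i-\bar{c})^2\Bigr) e^{2x} . \]

This we recognize as the distribution of a transform of a $\chi^2_{n-1}$
random variable.

If $X$ has the density $\mathrm{ds}_\Phi(x \mid 0,1,\vec{c})$ then $Y = e^{2X}$ has the density
(Jacobian $= \frac{1}{2y}$)

\[ f_Y(y \mid 0,1,\vec{c}) \propto y^{\frac{n-1}{2}} \exp\Bigl(-\frac{y}{2} \sum_{i=1}^{n} (c_i-\bar{c})^2\Bigr) \frac{1}{2y} \]

and hence is a $\chi^2_{n-1}$ scaled by $1 / \sum_{i=1}^{n} (c_i-\bar{c})^2$. This gets us the normalizing
constant of $\mathrm{ds}_\Phi(x \mid 0,1,\vec{c})$, i.e. the factor multiplying
$e^{(n-1)x} \exp\bigl(-\frac{e^{2x}}{2} \sum (c_i-\bar{c})^2\bigr)$, as

\[ \frac{\bigl[\sum_{i=1}^{n} (c_i-\bar{c})^2\bigr]^{\frac{n-1}{2}}}{\Gamma\bigl(\frac{n-1}{2}\bigr)\, 2^{\frac{n-1}{2}-1}} . \]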


$\mathrm{ds}_\Phi(x \mid 0,1,\vec{c})$ has two interesting properties:

[1] It contains the configuration only through $S = \sum_{i=1}^{n} (c_i-\bar{c})^2$.

[2] $S$ only affects the location of $\mathrm{ds}_\Phi(x \mid 0,1,\vec{c})$.
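Both properties can be checked numerically. The sketch below assumes the normalized Gaussian density has the form $C\, e^{(n-1)x} \exp(-S e^{2x}/2)$ with $C = 2 (S/2)^{(n-1)/2} / \Gamma((n-1)/2)$, which follows from the $\chi^2_{n-1}$ argument above; the function names and the values of $n$ and $S$ are illustrative assumptions.

```python
import math

def ds_gauss(x, n, S):
    """Conditional density of log(s) at x in the standard Gaussian situation,
    for sample size n and S = sum((c_i - mean(c))**2)."""
    nu = n - 1
    log_const = math.log(2.0) + 0.5 * nu * math.log(S / 2.0) - math.lgamma(nu / 2.0)
    return math.exp(log_const + nu * x - 0.5 * S * math.exp(2.0 * x))

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))

n, S = 10, 7.5                       # hypothetical sample size and sum of squares
total = integrate(lambda x: ds_gauss(x, n, S), -12.0, 4.0)
shift = 0.5 * math.log(S)            # property [2]: S acts as a pure translation
print(total)
print(ds_gauss(0.3, n, S), ds_gauss(0.3 + shift, n, 1.0))
```

The density for a general $S$ is the density for $S = 1$ moved by $\frac{1}{2}\log(S)$, so interval lengths derived from it cannot depend on the configuration.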


[2] implies that the single situation confidence intervals in the
Gaussian situation will all be of the same length even across
configurations. If we sample from the Gaussian, the precision of our
knowledge about $\tau$ is determined by the sample size and is not
dependent on the point pattern of our sample. The interval bounds
(2.3) are the usual symmetric $\chi^2$ intervals transformed to the
$\tau$-level.


For more general shapes $F$ the above is not true and the length
of the single situation confidence intervals will vary from
configuration to configuration.

The Gaussian analysis is in fact somewhat naive. The Gaussian
confidence intervals are too short for a heavy-tailed situation as
for example the slash.

The single situation slash intervals are uniformly, i.e. for all
configurations, longer than the Gaussian intervals.

2.3. The two situation case: Gaussian and slash

In order to get a feeling for the problems we face, we intend
to study now the slash behavior of the confidence interval for $\tau$
based on a "Gaussian analysis".

In order to compute the slash coverage probability we are forced to
specify which parameter we want to cover. If we choose our standard
in the slash family as

\[ f_r(x) = \frac{r}{(2\pi)^{1/2}\, x^2} \Bigl[ 1 - \exp\Bigl(-\frac{x^2}{2r^2}\Bigr) \Bigr] \qquad (2.4) \]

values for $r$ around $\frac{1}{2}$ ($f_{1/2}(0) = \frac{1}{(2\pi)^{1/2}}$ as for the standard Gaussian)
make sense. From (2.2) we know of course that this implies just a
translation of $\mathrm{ds}_{slash}(\,)$.
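The scaled slash density can be sketched in a few lines. This assumes the usual definition of the slash as a standard Gaussian divided by an independent uniform on (0, 1), multiplied by the scale $r$; the function name is hypothetical.

```python
import math

def slash_density(x, r=1.0):
    """Density of r * (Z / U), Z standard Gaussian, U uniform on (0, 1)."""
    if abs(x) < 1e-8 * r:
        # limit of the formula as x -> 0
        return 1.0 / (2.0 * r * math.sqrt(2.0 * math.pi))
    u = x / r
    # -expm1(-a) equals 1 - exp(-a) and stays accurate for small a
    return -math.expm1(-0.5 * u * u) / (r * math.sqrt(2.0 * math.pi) * u * u)

gauss_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
print(slash_density(0.0, 0.5), gauss_at_zero)
```

With $r = \frac{1}{2}$ the slash density at 0 agrees with the standard Gaussian density at 0, which is one way of fixing the relative scale of the two families.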


Now we have identified our problem as one of too much freedom.
In order to have a compatible meaning of a "scale parameter" in two
different location and scale families, i.e. two different shapes, we
have to fix the relative scale between the two. More simply put, we
have to specify a standard distribution in each family.

remark: There are obviously several ways in which we can do this
matching of families (see: Tukey (1930)). If we restrict attention to
shapes with finite second moment, one natural choice of the standard
form is a member of the family with variance 1. In that case the
target of our estimator or confidence interval is the standard
deviation.

Another idea is the matching of percentiles. In the case of the
Gaussian and the slash family this leads to smaller values of $r$ if
we match further out in the tail (see: Rogers and Tukey (1972)).
Finally we need not match at all. We can study estimators like the
median absolute deviation MAD and accept whatever "matching" it
imposes, i.e. accept whatever it estimates on the population level.
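Percentile matching can be illustrated numerically. The sketch below assumes the standard slash is a standard Gaussian divided by an independent uniform on (0, 1), uses its closed-form distribution function, and defines the matching $r$ as the ratio of the Gaussian to the slash quantile; the function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()

def slash_cdf(x):
    """CDF of the standard slash Z / U (x = 0 handled as a limit)."""
    if x == 0.0:
        return 0.5
    return _STD.cdf(x) - (_STD.pdf(0.0) - _STD.pdf(x)) / x

def slash_quantile(p, lo=-1e6, hi=1e6):
    """Invert the slash CDF by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slash_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def matching_r(p):
    """r such that r * (slash p-quantile) equals the Gaussian p-quantile."""
    return _STD.inv_cdf(p) / slash_quantile(p)

r_values = {p: matching_r(p) for p in (0.75, 0.90, 0.975)}
for p, r in r_values.items():
    print(p, round(r, 4))
```

Under these assumptions the matching value of $r$ falls as the matched percentile moves out into the tail, in line with the remark above; matching the 97.5% point gives a value near 0.12.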

If we try to optimize the slash coverage probability for the
Gaussian-balanced $\tau$-intervals, we are led to values for $r$ around
[illegible], which corresponds to matching the 97.5% point. The maximal
slash coverage of the usual $\chi^2$-intervals we can achieve in this way
is about 32% for samples of size 20 and 44% for samples of size 10.
In all the experimental work we will consider only these two sample
sizes and leave sample size 5 aside.

We see from the above numbers how short the Gaussian intervals are
from the slash point of view. Furthermore it is clear that $r =$ [illegible] is
a bad choice, since it concentrates on "extreme", slash-drawn
configurations and tries to make Gaussian estimation compatible to
"slash needs". We should rather try to choose $r$ in such a way that
the slash estimation is compatible to "Gaussian needs" on "nicely
behaved", Gaussian-drawn configurations. In that way we might hope
that the slash analysis gives about the right, i.e. compatible,
answer on Gaussian-drawn samples and can be used to extrapolate in a
sensible way to configurations containing outliers, where the
Gaussian analysis breaks down quickly.

If we were to allow a conditional choice of $r$ conditioned on
each configuration, we would find quite large differences between
configurations. If a configuration contains outliers, the value of $r$
such that the families are compatible goes down; in nicely behaved
configurations it is around [illegible]. In point estimation this causes a lot
of problems, since there will be a large part of the variability due
to "conditional bias", which we cannot escape.

2.4. Strong confidence intervals for $\tau = \log(\sigma)$

In this section we want to study the possibilities for
intervals which, conditioned on any configuration, reach at least
$100(1-\alpha)\%$ coverage probability, both for the Gaussian and for the
slash. For each configuration we get the balanced $\tau$-intervals
$[L_g, U_g]$ and $[L_s, U_s]$ for the Gaussian and slash situation (see
(2.3)). For reasons discussed above, we are free to move all
intervals relative to each other by a fixed constant. We will do
this by holding the Gaussian intervals fixed and moving the slash
ones. This can be described by choosing a value $r$ in (2.4).
Only samples of size 10 and 20 are considered.

It turns out that the slash intervals are longer than the
Gaussians in each configuration. If we were allowed to choose the
relative scale constant conditioned on the configuration, we could
always get to a case where the slash interval covers the Gaussian
interval. This is a bit like the confidence intervals for $\mu$ in
samples of size 5, where Student's t interval "dominates" the slash
interval.

A simple strong interval is given by

\[ L = \min\{L_g, L_s\}, \qquad U = \max\{U_g, U_s\} . \]

But now we have a relative scale constant at our disposal. Table 2.1
contains the fractions of configurations falling into the classes

(b) $L = L_s$ and $U = U_s$

(c) $L = L_g$ and $U = U_s$

(d) $L = L_s$ and $U = U_g$.

Table 2.1: Percentage of cases (b), (c) and (d)

                   Gaussian situation               slash situation
          1/r     (b)       (c)       (d)         (b)       (c)       (d)
size=20   2.6     82%       17 1/3%   2/3%        17 1/3%   0%        82 2/3%
          2.8     64%       35 1/3%   2/3%        18 2/3%   2%        79 1/3%
          3.0     48%       52%       0%          24 2/3%   2 2/3%    72 2/3%
          3.2     37 1/3%   62 2/3%   0%          29 1/3%   3 1/3%    57 1/3%
size=10   2.6     93 1/3%   0%        6 2/3%      38%       0%        52%
          2.8     85 1/3%   10%       4 2/3%      42%       0%        58%
          3.0     78%       18 2/3%   [illegible] 45 1/3%   2/3%      54%
          3.2     73 1/3%   26%       2/3%        48 2/3%   2%        49 1/3%

r is as in (2.4)
(b): slash dominates
(c): Gaussian low, slash high
(d): slash low, Gaussian high.

All these percentages are based on 150 sampled configurations.
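The bookkeeping behind such a table can be sketched as follows. The class labels follow the legend of Table 2.1; the interval endpoints in the demonstration are made up for illustration and are not data from the report, and the function names are hypothetical.

```python
def classify(gauss, slash):
    """Class of the strong interval [min L, max U] for one configuration.
    gauss and slash are (L, U) pairs; ties are credited to the slash interval."""
    (lg, ug), (ls, us) = gauss, slash
    low_from_slash = ls <= lg
    high_from_slash = us >= ug
    if low_from_slash and high_from_slash:
        return "b"      # slash dominates
    if high_from_slash:
        return "c"      # Gaussian low, slash high
    if low_from_slash:
        return "d"      # slash low, Gaussian high
    return "a"          # Gaussian dominates

def class_percentages(pairs):
    """Percentage of configurations in each class."""
    counts = {}
    for g, s in pairs:
        label = classify(g, s)
        counts[label] = counts.get(label, 0) + 1
    return {k: 100.0 * v / len(pairs) for k, v in counts.items()}

# Hypothetical (Gaussian, slash) interval pairs for three configurations.
pairs = [((-0.3, 0.4), (-0.6, 0.7)),
         ((-0.3, 0.4), (-0.1, 0.9)),
         ((0.5, 1.2), (-0.8, 0.3))]
print(class_percentages(pairs))
```

With 150 sampled configurations each configuration contributes 2/3 of a percentage point, which is why the table entries are multiples of 2/3%.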

The two situations behave differently. In slash-drawn configurations
the Gaussian interval often supplies the upper bound, more
prominently so for samples of size 20. Of course we expect this
behavior, which shows how much outliers influence the "Gaussian
analysis". In most of the Gaussian-drawn configurations, the slash
intervals dominate the Gaussian intervals. We learn that the two
situations favor different choices of the relative scale constant:
low for the Gaussian and high for the slash. Table 2.2 contains
expected lengths for the above strong confidence interval
procedures.


Table 2.2: Estimated expected lengths for strong confidence intervals

           1/r       Gaussian situation    slash situation
size=20    2.6       1.08 (.65)            2.06 (.72)
           2.8       1.10 (.69)            2.00 (.67)
           3.0       1.13 (.74)            1.95 (.63)
           3.2       1.16 (.78)            1.91 (.59)
           single    0.65 (.00)            1.20 (.00)
size=10    2.6       1.63 (.66)            2.41 (.32)
           2.8       1.63 (.66)            2.37 (.30)
           3.0       1.64 (.67)            2.33 (.28)
           3.2       1.65 (.68)            2.29 (.26)
           single    0.98 (.00)            1.82 (.00)

The numbers in parentheses are (length / single situation shortest
length) - 1, i.e. the mean length deficiencies, and the row labelled
"single" contains the length from the single situation balanced
intervals.
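The deficiencies can be recomputed from the tabulated lengths. The sketch below copies the expected lengths from Table 2.2 and applies the definition (length / single situation length) - 1; the helper names are hypothetical, and small differences from the printed values come from the two-decimal rounding of the lengths.

```python
# Estimated expected lengths from Table 2.2: {size: {1/r: (Gaussian, slash)}}
lengths = {
    20: {2.6: (1.08, 2.06), 2.8: (1.10, 2.00), 3.0: (1.13, 1.95), 3.2: (1.16, 1.91)},
    10: {2.6: (1.63, 2.41), 2.8: (1.63, 2.37), 3.0: (1.64, 2.33), 3.2: (1.65, 2.29)},
}
# Row "single": single situation balanced intervals (Gaussian, slash).
single = {20: (0.65, 1.20), 10: (0.98, 1.82)}

def deficiencies(size):
    """Mean length deficiency (Gaussian, slash) for each value of 1/r."""
    g0, s0 = single[size]
    return {inv_r: (g / g0 - 1.0, s / s0 - 1.0)
            for inv_r, (g, s) in lengths[size].items()}

def minimax_choice(size):
    """Value of 1/r with the smallest worst-case deficiency."""
    d = deficiencies(size)
    return min(d, key=lambda inv_r: max(d[inv_r]))

for size in (20, 10):
    for inv_r, (dg, ds) in deficiencies(size).items():
        print(size, inv_r, round(dg, 2), round(ds, 2))
print(minimax_choice(20))
```

For samples of size 20 the worst-case deficiency is smallest at 1/r = 2.8, which is the sense in which that choice is roughly minimax.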


Figure 2.1 plots the mean length deficiencies given in Table 2.2.
"x" marks the points for $\frac{1}{r}$ = 2.6, 2.8, 3.0 and 3.2 in samples of
size 20, "o" in samples of size 10.

[Figure 2.1. Recovered label fragments: "square mean length deficiencies of the bi-optimal intervals for", "Gaussian".]

$\frac{1}{r}$ = 2.8 seems a good choice for the two sample sizes: in "size=20"
it is roughly minimax, in "size=10" it roughly minimizes the
Gaussian deficiency (the minimum is rather flat). In Figure 2.1 we
see how the strong confidence intervals for $\tau$ lose a lot in the
Gaussian situation due to the shortness of the $\chi^2$-interval, which is
the Gaussian single situation choice. As the sample size decreases,
the slash interval more and more dominates the Gaussian one in the
slash situation (see Table 2.1). In Figure 2.1 we notice that the
strong intervals are really quite good in the slash situation for
samples of size 10. The choice $\frac{1}{r}$ = 2.8 seems reasonable from what we
have just said. In the case of the smaller sample size (10) it
minimizes the Gaussian loss, in the case of larger samples (20) it
balances the losses in the Gaussian and the slash. In comparison to
location-parameter intervals the two situations under consideration
exchange places. Now the Gaussian based intervals are optimistically
short and the slash ones are long. As the sample size decreases, the
slash intervals dominate more prominently.

The relatively big slash loss in samples of size 20 is
puzzling. It is due to the fact that the strong intervals described
above often are "empty" in the center part for slash-drawn
configurations, i.e. the two single situation intervals are
separated by a gap $L_g - U_s$, which has a chance of happening
whenever the configuration falls into class (d). For $\frac{1}{r}$ = 2.8 such a
gap occurs in 42% of the slash-drawn configurations for samples of
size 20 and in 19% for samples of size 10. This is a problem which
did not occur in the case of confidence intervals for a location
parameter. There the strong intervals might have been "overlong"
when judged by the slash situation. But here the problem is that
neither of the two situations really "needs" these gaps, they are
"empty". If we measure the percentage of the total length which is
empty, we find that for samples of size 20 as much as 75% of the
total conditional length can be made up by empty space and for about
20% of all slash-drawn configurations the percentage of "emptiness"
is above [illegible] of the total length. For samples of size 10 this peculiar
problem is not so grave: about 4% of all slash-drawn
configurations are above [illegible] empty.
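The "emptiness" of a strong interval can be measured with a few lines of code. This is a minimal sketch with made-up interval endpoints, not values from the report; the function names are hypothetical.

```python
def strong_interval(gauss, slash):
    """Strong interval [min L, max U] built from the two (L, U) pairs."""
    (lg, ug), (ls, us) = gauss, slash
    return min(lg, ls), max(ug, us)

def empty_fraction(gauss, slash):
    """Share of the strong interval covered by neither single situation interval."""
    lo, hi = strong_interval(gauss, slash)
    first, second = sorted([gauss, slash])      # order by lower bound
    gap = max(0.0, second[0] - first[1])        # positive only if the intervals are disjoint
    return gap / (hi - lo)

# Hypothetical class (d) configuration: slash interval low, Gaussian interval high.
slash_iv = (-0.8, 0.1)
gauss_iv = (0.4, 1.0)
print(strong_interval(gauss_iv, slash_iv), empty_fraction(gauss_iv, slash_iv))
```

In this made-up case the gap $L_g - U_s$ is 0.3 out of a total length of 1.8, so one sixth of the strong interval is needed by neither situation.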

The gap problem we have discussed above results from an
incompatibility of the meaning of the Gaussian and the slash scale
parameters we have chosen. In configurations with outliers, the
"Gaussian model" breaks down and it can no longer be connected with
the "slash model" in a sensible way. We noticed this in the case of
confidence intervals for a location parameter, but it is even more
prominent when we discuss the scale parameter.

For the purpose of application, the strong intervals for a
scale parameter as given above are not a helpful description of what
is going on. We need a definition of the meaning of the scale
parameter not guided by one shape (usually the Gaussian) for all
configurations, but rather splicing together "meanings" guided by
different shapes. In the center section of $\mathrm{dp}_F(\,)$ (the marginal
density across configurations induced by sampling from shape $F$) the
shape $F$ determines the meaning of the scale parameter. Between the
shapes there will be a problem of relative scaling similar to the
one we have encountered in the case "Gaussian and slash". Solving
this solves part of the splicing problem.

From what we have learned about samples of size 20 and 10 we can
predict what is happening for size 5. The slash intervals for $\tau$ will
be much larger than the Gaussian $\chi^2$-intervals and it might well be
that for $\frac{1}{r}$ around 2.8 the slash intervals in nearly all
configurations contain the Gaussian intervals. The strong intervals
then would coincide with the slash intervals.


3. Bi-shortest confidence intervals for $\tau = \log(\sigma)$

As we have seen in the previous section the compromise holding
the conditional coverage probabilities fixed is not practical. In
this section we define intervals for $\tau$ which adapt better to the
differences in single situation solutions conditioned on the
configurations and avoid the empty space we encountered in the
previously discussed procedure.


Looking for the bi-shortest interval procedures on the $\log(\sigma)$-
scale leads to problems similar to the location parameter case as
described in Morgenthaler (1983b). The confidence distribution for
situation $F$ conditioned on configuration $\vec{c}$ is

\[ \mathrm{Co}_F(u) = 1 - \int_{-\infty}^{-u} \mathrm{ds}_F(x \mid 0,1,\vec{c})\, dx \qquad (3.1) \]

with density

\[ \mathrm{co}_F(u) = \mathrm{ds}_F(-u \mid 0,1,\vec{c}) \]

(see (2.3)).
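The relation between the confidence distribution and its density can be checked numerically. The sketch below uses the Gaussian conditional density in the form $C\, e^{(n-1)x} \exp(-S e^{2x}/2)$ as an example, truncates the lower integration limit, and compares a finite difference of $\mathrm{Co}$ with $\mathrm{co}$; the function names and the values of $n$ and $S$ are illustrative assumptions.

```python
import math

def ds_gauss(x, n, S):
    """Gaussian conditional density of log(s); S = sum((c_i - mean(c))**2)."""
    nu = n - 1
    log_const = math.log(2.0) + 0.5 * nu * math.log(S / 2.0) - math.lgamma(nu / 2.0)
    return math.exp(log_const + nu * x - 0.5 * S * math.exp(2.0 * x))

def conf_dist(u, n, S, lo=-15.0, steps=20000):
    """Co(u) = 1 - integral of ds over (-inf, -u], lower limit truncated at lo."""
    hi = -u
    h = (hi - lo) / steps
    acc = 0.5 * (ds_gauss(lo, n, S) + ds_gauss(hi, n, S))
    acc += sum(ds_gauss(lo + i * h, n, S) for i in range(1, steps))
    return 1.0 - h * acc

def conf_density(u, n, S):
    """co(u) = ds(-u), the derivative of Co(u)."""
    return ds_gauss(-u, n, S)

n, S = 10, 9.0                       # hypothetical sample size and sum of squares
eps = 1e-3
slope = (conf_dist(0.1 + eps, n, S) - conf_dist(0.1 - eps, n, S)) / (2 * eps)
print(slope, conf_density(0.1, n, S))
```

The finite-difference slope of the confidence distribution agrees with the reflected conditional density, which is the statement of the display above.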

The bi-shortest intervals for $\tau$ given the shadow prices $p_g$ and $p_s$
are given by the solution to

\[ h_k(u) = \frac{p_g w_g^k + p_s w_s^k}{\lambda_g w_g^k + \lambda_s w_s^k} \qquad (3.2) \]

where $U_k$ denotes the largest solution and $L_k$ the smallest and
$k = 1, \ldots, N$. The Lagrange multipliers $\lambda_g$ and $\lambda_s$ are adjusted so that
both overall coverage probabilities are at least $100(1-\alpha)\%$. $h_k(\,)$ is
the mixture of the conditional confidence densities

\[ h_k(\,) = \frac{\lambda_g w_g^k\, \mathrm{co}_g^k(\,) + \lambda_s w_s^k\, \mathrm{co}_s^k(\,)}{\lambda_g w_g^k + \lambda_s w_s^k} . \]


The notations and ideas are the same as in Morgenthaler (1983b).
Note, however, that $\mathrm{co}_g(\,)$ is now the confidence density in the
Gaussian situation for the parameter $\tau = \log(\sigma)$.

The solution is simpler than in equations (5.5) where we had to use
$E_g(s \mid \vec{c}_k)$ and $E_s(s \mid \vec{c}_k)$
to adjust for the "scale" differences between configurations. This
difficulty disappears in the $\tau$-case since, as we saw, we
basically deal with a location problem with known scale.

We believe that measuring efficiency by expected length on the
logarithmic scale, i.e. after transforming to $\tau = \log(\sigma)$, makes at
least some sense. The similar procedures on the original scale, i.e.
for $\sigma$, are less desirable.
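The per-configuration step of such a procedure can be sketched on a grid. The sketch assumes one particular reading of the construction, namely that the interval for a configuration is the region where the multiplier-weighted mixture of the two confidence densities is at least the price-weighted level; that reading, the function names, the weights and the normal stand-in densities are all assumptions made for illustration.

```python
import math

def normal_pdf(u, mean, sd):
    """Stand-in confidence density for the demonstration only."""
    z = (u - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bi_shortest_interval(co_g, co_s, w_g, w_s, lam_g, lam_s, grid, p_g=1.0, p_s=1.0):
    """Smallest and largest grid point u with
    lam_g*w_g*co_g(u) + lam_s*w_s*co_s(u) >= p_g*w_g + p_s*w_s."""
    threshold = p_g * w_g + p_s * w_s
    inside = [u for u in grid
              if lam_g * w_g * co_g(u) + lam_s * w_s * co_s(u) >= threshold]
    if not inside:
        return None
    return min(inside), max(inside)

grid = [i / 1000.0 for i in range(-3000, 3001)]
co_g = lambda u: normal_pdf(u, 0.0, 0.2)     # hypothetical Gaussian-situation density
co_s = lambda u: normal_pdf(u, 0.0, 0.4)     # hypothetical slash-situation density

narrow = bi_shortest_interval(co_g, co_s, 1.0, 1.0, 10.0, 10.0, grid)
wide = bi_shortest_interval(co_g, co_s, 1.0, 1.0, 20.0, 20.0, grid)
print(narrow, wide)
```

Raising the multipliers lowers the effective cut-off and widens the interval, which is how the overall coverage probabilities would be tuned in this reading.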


3.1. The slash single-situation confidence interval procedure

We have already pointed out that the Gaussian and slash
situation trade places if we move from $\mu$ to $\tau$ (or $\sigma$). And just as
Student's t interval was conservative in the slash situation, we
have now the slash single-situation interval procedures which are
conservative in the Gaussian situation. To keep things simple we
will restrict attention to the symmetric slash intervals which have
fixed conditional confidence coefficients (note that we have fixed
the relative scale between the two families by choosing $\frac{1}{r}$ = 2.8 as
in the previous sections). This is not the bi-shortest confidence
interval procedure with shadow price ratio $\frac{p_s}{p_g} = \infty$, but is probably
not very different from it.

This symmetric interval has a Gaussian coverage probability of 96.2%
and 98.6% in samples of size 20 and 10, respectively. The
conditional Gaussian confidence levels are most of the time very
high and the tail towards low conditional coverages is a lot thicker
in samples of size 20. Of course this interval procedure is not
balanced if judged from the Gaussian point of view. We can see this
in Table 2.1 where the columns headed (c) and (d) show a
considerable imbalance. The slash single-situation intervals are
frequently too much to the right and hence miss the true $\tau$-value
most often by overshooting. The increase in expected length over the
symmetric $\chi^2$-intervals is considerable. The expected length is
increased by about a factor of $1\frac{2}{3}$ for both sample sizes.

Just as Student's t interval should not be applied
uncritically, but, as we have learned, can be modified
successfully, the slash single-situation intervals have undesirable
properties. They are, in Gaussian-drawn configurations, often
too pessimistic and wasteful. Introducing the Gaussian expected
length along with the slash expected length hopefully will help us
to find procedures which correct this wastefulness. But we must face
the need for confidence intervals longer than the common $\chi^2$-based
ones. In the next section the slash single-situation interval will
be used as a means of comparison to indicate our progress.

3.2. The bi-shortest $\tau$-interval for the shadow price ratio 1

Let us consider the bi-shortest confidence intervals for the
shadow prices $p_s = p_g = 1$ (see (3.2)).

Figure 3.1 shows us a plot of the resulting conditional expected
lengths vs. the conditional expected lengths of the slash single-
situation interval discussed in the previous section. All plots are
based on a sample of 150 configurations. The upper half shows the
samples of size 20, the lower half the samples of size 10. In both
cases we are indeed able to shorten, and hence "save some
information", in the Gaussian situation. Note, however, that in
samples of size 10 the task seems to be more difficult. Only in
configurations where the slash single-situation interval is short
are we able to shorten considerably. In samples of size 20, the
bi-shortest intervals are quite effectively shortened. In the slash
situation we have, of course, to give up something. Most of the
bi-shortest intervals are enlarged, thus balancing the configurations
where introducing the Gaussian along with the slash leads to
"erroneously" short intervals.

The length of a $\tau$-interval conditioned on the configuration is
fixed, i.e. constant, and is, furthermore, not dependent on the
underlying situation. It therefore reflects a property of the
configuration, which we can interpret as conditional
(degrees of freedom)$^{-1/2}$. The $\chi^2$-intervals act as if each
configuration had the same number of degrees of freedom. If we
introduce the slash situation, we learn that this cannot be
tolerated.
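The interpretation of length as (degrees of freedom)$^{-1/2}$ turns a ratio of interval lengths into a fraction of degrees of freedom retained. The following is a small sketch of that arithmetic; the function names and the ratio 1.5 used in the demonstration are hypothetical.

```python
def df_retained(length_ratio):
    """Fraction of degrees of freedom kept when an interval is length_ratio
    times longer, assuming length is proportional to df ** -0.5."""
    return 1.0 / length_ratio ** 2

def df_lost(length_ratio):
    """Complementary fraction of degrees of freedom lost."""
    return 1.0 - df_retained(length_ratio)

# Hypothetical example: an interval 1.5 times as long as the reference one.
print(df_retained(1.5), df_lost(1.5))
```

Because the relation is quadratic, even a moderate lengthening of the interval corresponds to a substantial loss of degrees of freedom.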

In Figure 3.1 we see how the ratio 1 confidence procedure recovers
some degrees of freedom in Gaussian-drawn configurations compared to
the slash single-situation intervals.

On the average we lose about $\frac{2}{3}$ of the "Gaussian degrees of
freedom" for both sample sizes (a bit less in samples of size 20)
by going from the $\chi^2$-intervals to the slash single-situation
intervals. Of course it is true that the slash situation is quite
an extreme challenge along with the Gaussian, but degrees of freedom
only [illegible] as large as the usual Gaussian degrees of freedom is not
uncommon (see Gosset (1927), especially Table III).

The bi-shortest interval procedure for shadow price ratio 1
recovers most of that loss for Gaussian-drawn configurations in
samples of size 20. It leads to a loss of about [illegible] of the Gaussian
degrees of freedom. This recovery, of course, is due partly to the
use of a better center for the confidence intervals (see Figure
3.2). In samples of size 10 we still, even with the bi-shortest
intervals, lose roughly [illegible] of the Gaussian degrees of freedom in
Gaussian-drawn configurations. Again we see that a compromise
between the Gaussian and the slash situations is more easily
possible in larger samples. In contrast to the location parameter
case the sample sizes 10 and 20 are now farther apart in the sense
that the "Gaussian loss" is considerable in samples of size 10 no
matter what we do.

The above discussion shows us that confidence intervals for $\tau$,
or for $\sigma$, need to be enlarged non-trivially over the "pure
Gaussian" ones, if we want to be more realistic about heavy-
tailed underlying situations. And while the slash situation is
certainly an extreme challenge, the conclusions are by no means
unrealistic.

We have already mentioned the importance of the center of our
$\tau$-intervals. Figure 3.2 shows us four plots of the bi-shortest
interval centers on the configuration-scale. For Gaussian-drawn
configurations, the Gaussian single-situation interval centers serve
as comparison values. In slash-drawn configurations, the slash
symmetric interval centers are used (again, of course, with $\frac{1}{r}$ = 2.8
as relative scale between the families). The upper half of the plot
shows us what is going on in samples of size 20. The bi-shortest
interval with shadow price ratio 1 has a center very nearly the same
as the symmetric slash interval in slash-drawn configurations. In
the Gaussian situation the bi-shortest interval has a center which
is slightly and almost uniformly, i.e. for all of the sampled
configurations, moved towards higher values. This again reflects the
fact that the choice $\frac{1}{r}$ = 2.8 is already too large if judged solely
from the Gaussian point of view.

In samples of size 10, the lower half of Figure 3.2 shows us
what is going on. Again the slash-drawn configurations [behave in a]
simple way. Playing the bi-shortest game only affects the [length]
(see Figure 3.1) but not, or only marginally, the center of the
confidence intervals for τ in slash-drawn configurations.
For Gaussian-drawn configurations the behavior is different in
samples of size 10. Note that we now compare to the Gaussian
single-situation interval centers. Clearly for some configurations
the bi-shortest center is moved upwards as it was for almost all
configurations of size 20. But for a considerable number of
Gaussian-drawn configurations the bi-shortest center is [lower].
Such a behavior can be explained by the fact that for size 10 the
distinction between Gaussian-drawn and slash-drawn configurations is
not as clear as it was for size 20.

Note that the interval center on the configuration-scale -- [...]
the length -- is not the conditional expected center.

Finally, we ought to look at the behavior of the conditional
confidence levels of our bi-shortest interval procedure. [...]
shows the slash conditional missing-probabilities of the [bi-shortest]
τ-interval with shadow price ratio 1 in samples of size [10]. The
points in this plot lie around the diagonal. This, of course,
follows from what we have seen in Figure 3.2: the centers of the
bi-shortest intervals are near to the centers of the slash [...]
intervals. In samples of size 20, the picture looks somewhat [less]
"neat". There are a few configurations where the bi-shortest
interval does not stretch far enough towards high τ-values.

The plot of the missing-probabilities in the Gaussian [situation]
looks as we expected. Most configurations are close
to the point (0,0), i.e. their conditional coverage is high. From
the origin a tail stretches along the x-axis, i.e. if the
bi-shortest interval has a low Gaussian conditional coverage
probability, this is due to the fact that the interval does not
stretch far enough towards low τ-values.

3.2.1. The bi-optimal curves


Figure 3.4 shows us the square mean length deficiencies of the
bi-shortest confidence intervals for τ in samples of size 10 and 20.
These deficiencies are defined by

\[
\mathrm{deficiency}_F(I) = \left( \frac{\text{length of } I \text{ in situation } F}{\text{minimal exp. length in situation } F} \right) - 1,
\]

where for the minimal exp. length we use the expected length of the
single-situation symmetric intervals for τ. The points for size 20
are marked by "x", the ones for size 10 by "o".

The minimax interval procedure in size 20 has an efficiency of
roughly [...] or about 83%. In samples of size 10 we still could lower
the maximal risk by lowering the shadow price ratio: the Gaussian
risk is dominating the picture for this smaller sample size and the
more we can push the Gaussian risk down, the better.
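
The deficiency defined above and the minimax choice over shadow price
ratios can be sketched as follows. The function names are
assumptions, and the table of lengths is hypothetical, made up purely
to show the mechanics; it does not reproduce the values behind
Figure 3.4.

```python
def deficiency(length, minimal_expected_length):
    """Relative excess length: length / minimal expected length - 1."""
    if minimal_expected_length <= 0:
        raise ValueError("minimal expected length must be positive")
    return length / minimal_expected_length - 1.0


def minimax_ratio(lengths, minimal):
    """Pick the shadow price ratio whose worst-case deficiency is smallest.

    lengths: {ratio: {situation: length of the procedure in that situation}}
    minimal: {situation: minimal expected length in that situation}
    Returns the chosen ratio and its worst-case deficiency.
    """
    def worst(ratio):
        return max(deficiency(lengths[ratio][s], minimal[s]) for s in minimal)

    best = min(lengths, key=worst)
    return best, worst(best)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only (not values from the report).
    minimal = {"gaussian": 1.0, "slash": 2.0}
    lengths = {
        0.5: {"gaussian": 1.10, "slash": 2.80},
        1.0: {"gaussian": 1.20, "slash": 2.40},
        2.0: {"gaussian": 1.50, "slash": 2.10},
    }
    print(minimax_ratio(lengths, minimal))
```

In this made-up table a lower shadow price ratio reduces the Gaussian
deficiency at the expense of the slash one, and the minimax choice is
the ratio at which the larger of the two deficiencies is smallest.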


The strong intervals we discussed in section 2 would not fit
into the above plot. If we compare Figure 3.4 with corresponding
figures in the location parameter case (see Morgenthaler (1983b)),
it is obvious that we are no longer able to compromise between our
two situations as effectively as in the location parameter case.
Figure 3.4 is drawn based on a relative scale of 2.8 between the
two shapes and we have to keep in mind that for any choice of the
relative scale constant there is a different plot.

4. What have we learned about confidence intervals for a scale
parameter?

It is fair to say that the area of scale estimation has not
been explored in a detailed way. This is even more true of the
interval-estimation problem. There is hardly any material on robust
confidence intervals for a scale parameter in the literature. The
present report closes this gap to some extent, but clearly more work
is necessary since still more new questions are raised than old
questions answered.

Intervals with conditional coverage probability of at least
100(1-α)% in both situations are not realistic because in some
configurations the two models are hard to reconcile and the
"strong" intervals are partly "empty".

The bi-shortest interval procedures are a better compromise for
the two underlying situations. They define confidence interval
procedures which are short in both the Gaussian and the slash
situation and reach the 95% confidence level for both situations.
They also make us realize that the Gaussian χ²-interval is too short
and has to be enlarged if we want to have a procedure which is safe
even in heavy-tailed situations.

For these bi-shortest procedures we need to understand the behavior
in other situations better. And it is also not entirely clear which
criterion should be adopted for bi-optimization.

It would be of interest to develop other robust confidence
intervals for a scale parameter. The only choices available now seem
to be based on jackknifing or bootstrapping robust scale statistics
like the hinge-spread or the median-absolute-deviation (see Hoaglin
et al. (1983), chapter 12 for further discussion on robust scale
estimators). None of these procedures were tried out in this thesis.
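
As a rough sketch of the bootstrap alternative mentioned above (which,
as stated, was not tried in this work), the following code forms a
percentile bootstrap interval for τ = log σ from the logarithm of the
median absolute deviation. Everything here is an assumption made for
illustration: the function names, the number of resamples, the use of
the unscaled MAD, and the plain percentile method. No claim is made
about its coverage in the Gaussian or slash situations.

```python
import math
import random
import statistics


def log_mad(sample):
    """log of the (unscaled) median absolute deviation from the median.

    Returns None when the MAD is zero, where the log is undefined.
    """
    med = statistics.median(sample)
    mad = statistics.median([abs(x - med) for x in sample])
    return math.log(mad) if mad > 0 else None


def bootstrap_log_scale_interval(sample, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap interval for log-scale based on log(MAD)."""
    rng = random.Random(seed)
    n = len(sample)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        value = log_mad(resample)
        if value is not None:  # skip degenerate resamples with MAD == 0
            stats.append(value)
    stats.sort()
    alpha = 1.0 - level
    lo = stats[math.floor(alpha / 2.0 * (len(stats) - 1))]
    hi = stats[math.ceil((1.0 - alpha / 2.0) * (len(stats) - 1))]
    return lo, hi


if __name__ == "__main__":
    rng = random.Random(1983)  # arbitrary seed, for reproducibility only
    data = [rng.gauss(0.0, 1.0) for _ in range(20)]
    print(bootstrap_log_scale_interval(data))
```

Because the resampled positions depend only on the seed, rescaling the
data by a factor s shifts both endpoints by log(s), in line with the
equivariance required of τ-estimators in section 1.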

The scale parameter may sometimes be the primary parameter of
interest, though most of the time it will be a nuisance parameter.
It might well be that the methods developed in this report can help
us in setting confidence limits for the location parameter. This
idea has not been explored yet.